Learning Algorithms for Connectionist Networks: Applied Gradient Methods of Nonlinear Optimization
Abstract
The problem of learning using connectionist networks, in which network connection strengths are modified systematically so that the response of the network increasingly approximates the desired response, can be structured as an optimization problem. The widely used back propagation method of connectionist learning [19, 21, 18] is set in the context of nonlinear optimization, and in this framework the issues of stability, convergence and parallelism are considered. As a form of gradient descent with fixed step size, back propagation is known to be unstable; this is illustrated using Rosenbrock's function and contrasted with stable methods that perform a line search in the gradient direction. The convergence criterion for connectionist problems involving binary functions is discussed relative to the behavior of gradient descent in the vicinity of local minima, and a minimax criterion is compared with the least squares criterion. The contribution of the momentum term [19, 18] to more rapid convergence is interpreted relative to the geometry of the weight space: in plateau regions of relatively constant gradient, the momentum term acts to increase the step size by a factor of 1/(1 - μ), where μ is the momentum constant; in valley regions with steep sides, it acts to focus the search direction toward the local minimum by averaging out oscillations in the gradient.

Comments
University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-88-62. This technical report is available at ScholarlyCommons: http://repository.upenn.edu/cis_reports/597
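The abstract's claims can be checked numerically. The sketch below (Python with NumPy; it is not the report's GRADSIM code, and the learning rate, iteration counts, starting point, and momentum value are illustrative assumptions) shows the fixed-step rule diverging on Rosenbrock's function, a backtracking line search in the negative gradient direction descending stably, and the momentum recurrence amplifying the effective step on a constant-gradient plateau by 1/(1 - μ):

```python
# Illustrative sketch only; not the report's GRADSIM simulator.
import numpy as np

np.seterr(over="ignore", invalid="ignore")  # let the divergent run reach inf/nan quietly

def rosenbrock(w):
    x, y = w
    return (1.0 - x) ** 2 + 100.0 * (y - x ** 2) ** 2

def rosenbrock_grad(w):
    x, y = w
    return np.array([-2.0 * (1.0 - x) - 400.0 * x * (y - x ** 2),
                     200.0 * (y - x ** 2)])

def fixed_step_descent(w, lr=0.01, steps=8):
    # Back propagation's update: w <- w - lr * g, with the step size lr fixed.
    for _ in range(steps):
        w = w - lr * rosenbrock_grad(w)
    return w

def line_search_descent(w, steps=100):
    # Stable alternative: backtracking (Armijo) line search along the
    # negative gradient, so the error can never increase.
    for _ in range(steps):
        g = rosenbrock_grad(w)
        t, f0, gg = 1.0, rosenbrock(w), g.dot(g)
        while rosenbrock(w - t * g) > f0 - 1e-4 * t * gg:
            t *= 0.5
        w = w - t * g
    return w

def momentum_amplification(mu, lr=0.1, g=1.0, steps=200):
    # On a plateau with constant gradient g, dw_t = -lr*g + mu*dw_{t-1}
    # sums a geometric series, so |dw| approaches lr*g / (1 - mu).
    dw = 0.0
    for _ in range(steps):
        dw = -lr * g + mu * dw
    return dw / (-lr * g)  # effective step-size amplification

w0 = np.array([-1.2, 1.0])  # a standard starting point for Rosenbrock
print(rosenbrock(fixed_step_descent(w0)))   # overflows to inf/nan within a few steps
print(rosenbrock(line_search_descent(w0)))  # decreases steadily toward the minimum at (1, 1)
print(momentum_amplification(0.9))          # ~10.0, i.e. 1/(1 - 0.9)
```

The same fixed step that is far too large on the steep sides of Rosenbrock's valley would be far too small on its flat floor, which is the dilemma the line search and the momentum term address from opposite directions.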
Similar Resources
GRADSIM: A Connectionist Network Simulator Using Gradient Optimization Techniques
A simulator for connectionist networks which uses gradient methods of nonlinear optimization for network learning is described. The simulator (GRADSIM) was designed for temporal flow model connectionist networks. The complete gradient is computed for networks of general connectivity, including recurrent links. The simulator is written in C, uses simple network and data descriptors for flexibili...
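The complete-gradient computation for recurrent links that the snippet mentions is back propagation through time. The following minimal Python sketch illustrates the general idea only; it is not GRADSIM's C implementation, and the single tanh unit and the toy sequence are assumptions:

```python
import math

def bptt_gradient(w_in, w_rec, xs, targets):
    # Complete gradient of E = 0.5 * sum_t (h_t - target_t)^2 for a single
    # recurrent unit h_t = tanh(w_in * x_t + w_rec * h_{t-1}), with h_0 = 0.
    hs, h = [0.0], 0.0
    for x in xs:                        # forward pass through time
        h = math.tanh(w_in * x + w_rec * h)
        hs.append(h)
    gw_in = gw_rec = dh = 0.0
    for t in range(len(xs), 0, -1):     # backward pass through time
        dh += hs[t] - targets[t - 1]    # direct error contribution at step t
        da = dh * (1.0 - hs[t] ** 2)    # back through the tanh nonlinearity
        gw_in += da * xs[t - 1]
        gw_rec += da * hs[t - 1]        # contribution through the recurrent link
        dh = da * w_rec                 # carry the gradient back to h_{t-1}
    return gw_in, gw_rec

print(bptt_gradient(0.7, 0.4, [0.5, -0.3, 0.8], [0.2, 0.0, 0.4]))
```

The carried term dh = da * w_rec is what makes the gradient "complete": error at later time steps flows back through the recurrent connection to influence the gradient at earlier ones.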
A Hybrid Optimization Algorithm for Learning Deep Models
Deep learning is a subset of machine learning that is widely used in Artificial Intelligence (AI) fields such as natural language processing and machine vision. Learning algorithms require optimization in several respects, and model-based inference generally requires solving an optimization problem. In deep learning, the most important problem that can be solved by optimization is neural n...
A New Strategy for Training RBF Network with Applications to Nonlinear Integral Equations
A new learning strategy is proposed for training radial basis function (RBF) networks. Two different local optimization methods are applied to update the output weights during training: the gradient method and a combination of the gradient and Newton methods. Numerical results obtained in solving nonlinear integral equations show the excellent performance of the combined gradient method in ...
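One plausible reading of such a combined scheme, sketched here under stated assumptions rather than taken from the paper (the Gaussian basis, its width, the learning rate, and the toy sine-fitting data are all illustrative): because the output weights of an RBF network enter the squared error linearly, a Newton step on those weights amounts to solving the normal equations and can finish an initial gradient-descent phase in a single update.

```python
import numpy as np

def rbf_design(X, centers, width):
    # Gaussian basis matrix: Phi[i, j] = exp(-||x_i - c_j||^2 / (2 * width^2))
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * width ** 2))

def train_output_weights(Phi, y, lr=0.05, grad_steps=100):
    # Phase 1, gradient method: steepest descent on E(w) = 0.5 * ||Phi w - y||^2.
    w = np.zeros(Phi.shape[1])
    for _ in range(grad_steps):
        w -= lr * Phi.T @ (Phi @ w - y)
    # Phase 2, Newton step: E is quadratic in w, so one update
    # w <- w - H^(-1) * grad with H = Phi^T Phi reaches the minimum
    # (a tiny ridge term keeps H safely invertible).
    H = Phi.T @ Phi + 1e-8 * np.eye(Phi.shape[1])
    return w - np.linalg.solve(H, Phi.T @ (Phi @ w - y))

X = np.linspace(-1.0, 1.0, 20)[:, None]   # 1-D sample points
C = np.linspace(-1.0, 1.0, 5)[:, None]    # 5 basis centers
Phi = rbf_design(X, C, width=0.4)
y = np.sin(3.0 * X[:, 0])
w = train_output_weights(Phi, y)
print(np.abs(Phi @ w - y).max())          # residual of the least squares fit
```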
Convergence Analysis of Gradient Descent Algorithms with Proportional Updates
The rise of deep learning in recent years has brought with it increasingly clever optimization methods to deal with complex, non-linear loss functions [13]. These methods are often designed with convex optimization in mind, but have been shown to work well in practice even for the highly non-convex optimization associated with neural networks. However, one significant drawback of these methods ...